data science for
nighttime lights

part 1: data processing



Use R + Python to transform satellite imagery
into economic insight



Credits
Technical Guidance

Algorithms
  • Star Ying, Data Scientist, Project Co-Lead
  • Jeff Chen, Chief Data Scientist, Project Co-Lead

Every day, NOAA satellites collect terabytes of raw data.
Every time a satellite passes over is an opportunity to improve our understanding of society in near real time. The imagery captures different spectral bands of light, enabling a wide range of scientific and operational capabilities. In particular, nighttime satellite data can be transformed into approximations of economic activity as well as demographic patterns.

In collaboration with the National Oceanic and Atmospheric Administration (NOAA), the Commerce Data Service has conducted a number of research and development sprints to identify novel applications of satellite imagery that improve economic monitoring.

Among the many satellite missions is the Suomi National Polar-orbiting Partnership (NPP), which carries five state-of-the-art instruments that help scientists understand the Earth in unprecedented detail. Among these instruments is the Visible Infrared Imaging Radiometer Suite (VIIRS), which is designed to provide moderate-resolution, radiometrically accurate images of the entire Earth twice daily. The VIIRS instrument collects a variety of data corresponding to different spectral bands of light. These in turn can be combined through well-tuned remote sensing algorithms to monitor and analyze a broad set of environmental attributes, including aerosols, clouds, land (fires, temperature), and ocean (ocean color and sea temperature). The immediate scientific contributions have been many, and the applications in fields beyond earth science are largely untapped.

VIIRS

The VIIRS Day/Night Band (DNB) collects levels of nighttime light across the globe at 750-meter resolution, which can be used as a proxy for human activity and population density, among other demographic and economic measures. The data is processed into derived products at varying levels of temporal resolution, such as monthly composites.

For simplicity, we’ll focus on the monthly composites. To illustrate the value, the 35 largest US cities by population are mapped and colored in order to reveal the intricate detail and gradations in each local urban environment. Within each city, the relative intensity of light is indicative of relative differences in activity: brighter yellow indicates more population activity and darker blue indicates less activity. The development patterns are distinct in each city, resembling the waterways and roadways that surround and intersect with areas of concentrated light intensity. The central business districts are represented by bright and intense yellow clusters.

Note that each map’s light intensity is scaled specifically to its respective city’s light distribution. Placing all cities on the same light intensity scale enables between-city comparisons and reveals relative differences in human activity. As can be seen below, cities like New York, Los Angeles and Chicago are rendered in a near-uniform bright yellow, indicating that neighborhoods within those cities are among the brightest in the US. Occasional specks of black indicate areas that are beyond the upper bound of the light distribution.
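The difference between per-city and shared scaling can be sketched in a few lines of Python. This is a minimal illustration, not the code behind the maps: the `rescale` helper, the city names, the radiance values and the shared bounds of 1.0 and 100.0 are all hypothetical. Values beyond the bounds are returned as `None`, assumed here to stand in for the black specks described above.

```python
def rescale(values, lower, upper):
    """Map radiance onto [0, 1]; values outside [lower, upper] become None."""
    span = upper - lower
    return [(v - lower) / span if lower <= v <= upper else None for v in values]

# Hypothetical radiance samples for two cities (illustrative numbers, not VIIRS data).
cities = {
    "dim_city": [1.0, 2.0, 4.0, 8.0],
    "bright_city": [10.0, 40.0, 80.0, 160.0],
}

# Per-city scale: each city uses its own minimum and maximum,
# so every city spans the full color range on its own map.
per_city = {name: rescale(v, min(v), max(v)) for name, v in cities.items()}

# Common scale: one pair of bounds (assumed values) shared by every city,
# so colors are comparable between cities.
common = {name: rescale(v, 1.0, 100.0) for name, v in cities.items()}

for name in cities:
    print(name, per_city[name], common[name])
```

On the per-city scale both cities reach the top of the color range. On the common scale the dimmer city stays near the bottom, and the brightest pixel of the brighter city falls outside the range.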


Seeing is just the beginning.


Beyond visual examination, the VIIRS DNB data can be statistically correlated with economic and social variables using both the Total Nighttime Light (TNL) and the radiance (light) distribution. Let’s examine the corresponding radiance distributions of Metropolitan Statistical Areas (MSAs), a common geographic unit of analysis:


The histograms below show the radiance distribution of each MSA: the count (y-axis) is the number of pixels within the geographic footprint at each radiance level, and the radiance is log-transformed for ease of interpretation (each unit increase in the logarithm corresponds to a multiplicative increase in brightness). It is clear that areas of greater population have correspondingly greater TNL values as well as fuller radiance distributions that span a greater range of values.
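As a rough sketch of the two quantities discussed above, the snippet below computes a TNL value and a log-radiance histogram for one footprint. It makes several assumptions: TNL is taken to be the sum of pixel radiances within the footprint, the logarithm is the natural log, non-positive pixels are skipped, and the function names, bin width and pixel values are made up for illustration.

```python
import math
from collections import Counter

def total_nighttime_light(radiances):
    """Assumed definition: sum of radiance over all pixels in the footprint."""
    return sum(radiances)

def log_radiance_histogram(radiances, bin_width=0.5):
    """Count pixels per bin of natural-log radiance.

    Keys are the lower edge of each bin. Non-positive radiances have no
    logarithm and are skipped.
    """
    counts = Counter()
    for r in radiances:
        if r > 0:
            bin_start = math.floor(math.log(r) / bin_width) * bin_width
            counts[bin_start] += 1
    return dict(sorted(counts.items()))

# Illustrative pixel values only, not real VIIRS radiances.
pixels = [0.5, 1.0, 1.2, 3.0, 20.0, 150.0]

tnl = total_nighttime_light(pixels)
hist = log_radiance_histogram(pixels, bin_width=1.0)
print(tnl)
print(hist)
```

For real composites the pixel values would come from the raster cells that fall inside an MSA boundary, and an array library would likely replace the plain lists used here.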

Compared directly, TNL and population are positively associated, with a relatively high correlation coefficient (rho = 0.78).
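The text does not say which correlation coefficient was used or whether the inputs were transformed first. Assuming a Pearson coefficient on untransformed values, the calculation could be sketched as follows. The `pearson` helper and both data series are hypothetical and are not meant to reproduce the reported value.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Square roots of the summed squared deviations.
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Made-up values standing in for per-MSA TNL and population.
tnl_by_msa = [1.0, 2.0, 3.0, 4.0, 5.0]
population_by_msa = [2.0, 1.0, 4.0, 3.0, 5.0]

rho = pearson(tnl_by_msa, population_by_msa)
print(round(rho, 2))
```

A rank-based coefficient such as Spearman's would be a reasonable alternative if the relationship is monotonic but not linear.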


Correlating with population is just the beginning. There are thousands of data variables that can be combined with VIIRS data to improve the spatial resolution or temporal resolution of currently available economic measures.